AMENDMENT 

Serial Number: 10/585,680 
Filing Date: July 10, 2006

Title: METHOD AND APPARATUS FOR PARTITIONING PROGRAMS TO BALANCE MEMORY LATENCY 

IN THE CLAIMS 

Please amend the claims as follows. 

1. (Currently Amended) A method of compiling code, comprising:
partitioning instructions in the code among a plurality of processors based on memory
access latency associated with the instructions by partitioning memory access dependence chains.

2. (Canceled) 

3. (Original) The method of Claim 1, wherein partitioning instructions comprises 
partitioning a memory access dependence chain into an upstream stage. 

4. (Currently Amended) The method of Claim 1, wherein partitioning instructions 
comprises partitioning a memory access dependence chain into an upstream stage by assigning a 
first number of desired upstream nodes to the upstream stage, and assigning instructions in the 
code on which the first number of desired upstream nodes are dependent to the upstream
stage. 

5. (Original) The method of Claim 4, wherein the number of desired upstream nodes is 
the length of the memory access dependence chain divided by a pipelining degree. 

6. (Currently Amended) The method of Claim 4, further comprising: 
generating a new number of desired upstream nodes if a computed weight of the
upstream stage exceeds a predetermined value; and
assigning a first new number of desired upstream nodes to the upstream stage; and
assigning instructions in the code on which the first new number of desired upstream
nodes are dependent to the upstream stage.

7. (Original) The method of Claim 3, further comprising partitioning the memory access 
dependence chain into a downstream stage. 

8. (Currently Amended) The method of Claim 7, wherein partitioning the memory 
access dependence chain into the downstream stage comprises: 
assigning a last number of desired downstream nodes to the downstream stage; and
assigning instructions in the code which are dependent on the last number of desired
downstream nodes to the downstream stage. 

9. (Original) The method of Claim 8, wherein the number of desired downstream nodes 
is N*(d-1)/d, where N is a length of the memory access dependence chain, and d is a pipelining
degree. 

10. (Original) The method of Claim 1, further comprising identifying instruction 
dependence information. 

11. (Original) The method of Claim 1, further comprising constructing a memory access
dependence graph. 

12. (Original) The method of Claim 1, further comprising: 
constructing a memory access dependence graph; and
identifying a memory access dependence chain from the memory access dependence
graph.

13. (Currently Amended) An article of manufacture comprising a non-transitory 
machine accessible medium including sequences of instructions, the sequences of instructions 
including instructions which when executed cause the machine to perform: 

partitioning instructions in code into a plurality of pipeline stages to be executed in 
parallel by a plurality of processors based on memory access latency associated with the
instructions. 

14. (Currently Amended) The article of manufacture of Claim 13, further comprising 
instructions which when executed causes the machine to further perform constructing a memory 
access dependence graph. 

15. (Currently Amended) The article of manufacture of Claim 13, wherein partitioning 
instructions comprises partitioning a memory access dependence chain into an upstream stage by
assigning a first number of desired upstream nodes to the upstream stage, and assigning 
instructions in the code on which the first number of desired upstream nodes are dependent
to the upstream stage. 

16. (Currently Amended) A code partitioning analysis unit implemented on a processor,
comprising: 

a dependence information unit to identify dependencies between instructions in code; and
a code partitioning unit to partition instructions in the code into a plurality of pipeline
stages to be executed by a plurality of processors based on memory access latency
associated with the instructions.



17. (Currently Amended) The apparatus of Claim 16, wherein the code partition unit 
comprises: 

a length unit to determine a number of desired lengths of upstream nodes from a memory
access dependence chain to assign to an upstream stage;

an assignment unit to assign a first number of desired lengths of upstream nodes to the
upstream stage;

a close up unit to assign instructions in the code for which the first number of desired
length of upstream nodes are dependent to the upstream stage; and

an evaluation unit to determine whether a computed weight of the upstream stage exceeds 
a predetermined value. 

18. (Currently Amended) The apparatus of Claim 16, wherein the length unit
determines a new number of desired length of upstream nodes in response to the evaluation unit 
determining that the computed weight of the upstream stage exceeds the predetermined value. 

19. (Currently Amended) The apparatus of Claim 16, wherein the length unit determines 
a number of desired length of downstream nodes from the memory access dependence chain to
assign to a downstream stage, the assignment unit assigns a first number of desired length of
downstream nodes to the downstream stage, the close up unit assigns instructions in the code
on which are dependent on the first number of desired length of downstream nodes, and an
evaluation unit to determine whether a computed weight of the downstream stage exceeds the 
predetermined value. 

20. (Currently Amended) The apparatus of Claim 19, further comprising a balancing
unit to assign remaining instructions to the upstream stage and the downstream stage in a manner 
that substantially balances computed weight. 
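
Illustrative note (not part of the claims): the node counts recited in Claims 5 and 9 can be sketched as simple arithmetic on a memory access dependence chain. The sketch below is a minimal illustration only. The function names, the list representation of a chain, and the use of integer (floor) division are assumptions made for the example and are not recited in the claims.

```python
def desired_node_counts(chain_length, pipelining_degree):
    """Return (upstream, downstream) desired node counts.

    upstream   = N / d            (as recited in Claim 5)
    downstream = N * (d - 1) / d  (as recited in Claim 9)
    Floor division is an assumption; the claims do not specify rounding.
    """
    if pipelining_degree < 1:
        raise ValueError("pipelining degree must be at least 1")
    upstream = chain_length // pipelining_degree
    downstream = chain_length * (pipelining_degree - 1) // pipelining_degree
    return upstream, downstream


def split_chain(chain, pipelining_degree):
    """Take the first `upstream` nodes and the last `downstream` nodes of a chain.

    `chain` is a hypothetical list of nodes ordered by dependence.
    """
    upstream, downstream = desired_node_counts(len(chain), pipelining_degree)
    # Slice from an explicit index so that a downstream count of 0 yields an empty list.
    return chain[:upstream], chain[len(chain) - downstream:]
```

For example, under these assumptions a chain of eight nodes with a pipelining degree of two yields four desired upstream nodes and four desired downstream nodes.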



